Comprehensive analysis of differentially expressed miRNAs in hepatocellular carcinoma: Prognostic, predictive significance and pathway insights

Robust prognostic and predictive factors for hepatocellular carcinoma, a leading cause of cancer-related deaths worldwide, have not yet been identified. Previous studies have identified potential HCC determinants such as genetic mutations, epigenetic alterations, and pathway dysregulation. However, the clinical significance of these molecular alterations remains elusive. MicroRNAs are major regulators of protein expression. MiRNA functions are frequently altered in cancer. In this study, we aimed to explore the prognostic value of differentially expressed miRNAs in HCC, to elucidate their associated pathways and their impact on treatment response. To this aim, bioinformatics techniques and clinical dataset analyses were employed to identify differentially expressed miRNAs in HCC compared to normal hepatic tissue. We validated known associations and identified a novel miRNA signature with potential prognostic significance. Our comprehensive analysis identified new miRNA-targeted pathways and showed that some of these protein coding genes predict HCC patients’ response to the tyrosine kinase inhibitor sorafenib.


Introduction
Primary liver cancer is the second most common cause of cancer mortality in the world [1].Hepatocellular carcinoma (HCC) is the most common histology identified in primary liver cancers.HCC prognosis is primarily determined by stage, liver function, patient conditions and, indirectly, by HCC aetiology [2].The five-year survival ranges from more than 50% in operable diseases to less than 5% in the most advanced stages [3].This is due to the lack of effective treatments, especially in the setting of high tumour burden, where systemic treatments can lead to a median survival of less than two years.Despite extensive molecular studies which have identified potential determinants among genetic mutations, epigenetic alterations and pathway dysregulation [4], HCC pathogenesis is not entirely elucidated.A better understanding of the molecular processes driving HCC could lead to more effective treatments.
MicroRNAs (miRNAs) are short non-coding RNAs that regulate mRNA translation and stability [5].The human genome encodes approximately 2000 miRNAs, but each of these transcripts can regulate the expression of hundreds of mRNAs.It has been estimated that miRNAs regulate the expression of approximately 60% of the protein-coding genes.The expression of miRNAs is deregulated in many cancers [6], making them attractive biomarkers and therapeutic targets.The role of specific miRNAs in HCC has been studied [7], but there are only limited experiences on comprehensive studies assessing the prognostic function of miRNAs in HCC and the downstream pathways in this malignancy [8][9][10] leading to the identification of nonunivocal signatures.
In this study, we used a combination of bioinformatic techniques and clinical dataset analyses to identify miRNAs that are differentially expressed in HCC versus normal hepatic tissue and whose expression is associated with HCC prognosis.This approach allowed us to confirm known associations, identify several uncharacterised miRNAs and propose new potential targeting pathways with potential clinical relevance.

Differential expression analysis
A data-led approach was chosen over a literature-based candidate gene-identifying approach for the sake of analysis comprehensiveness.OncoMir Cancer Database (OMCD) [11] was used to access data on miRNA expression from Liver hepatocellular carcinoma (referred to as LIHC) dataset.LIHC dataset comprises The Cancer Genome Atlas (TCGA) data from 426 samples, of which 375 HCC patients and 51 non-HCC liver tissue.Differential expression between HCC and non-HCC liver tissue was tested via t-test on OMCD portal (Bonferroniadjusted p values of < 0.05) and miRNAs whose expression in HCC was at least 2 folds different (ratios > 2 or < 0.5) from non-HCC were retained for further analyses.

Survival analysis
The shortlisted differentially expressed miRNAs were cross-referenced with miRNAs RNAseq TCGA dataset provided by Kaplan-Meier Plotter (KMP) tool [12].Only RNA-seq data from TCGA samples of patients who had not undergone any other treatment (either radio-or chemotherapy or systemic drugs) than surgery was comprised in the survival analysis.Patients were dichotomized according to each differentially expressed miRNA median values, into high-and low-expression cohorts.Overall survival (OS) was chosen as the prognostic endpoint of interest and compared among high-and low-expression patients through log-rank test pvalue.Each comparison is presented as median OS (mOS), Hazard Ratio (HR) with 95% Interval of Confidence (95%IC) and p-value.

Construction of a prognostic miRNA signature
MiRNAs identified in 2.2 bearing a significant prognostic value were used to generate a prognostic signature through miRpower tool in the KMP."Multiple genes" and "mean gene expression" options were set; patients were dichotomized according to the median value of the signature and the OS of the two cohorts was compared via log-rank test.

Pathway analysis
Pathway analysis was conducted to explore the mechanisms underpinning the prognostic effects of these differentially expressed miRNAs.This involved the identification of target genes for up-and down-regulated miRNAs, and KEGG pathway analysis using the Over Representation Analysis (ORA) method.

Selection of mature miRNAs.
The TCGA miRNA expression data used by the OMCD and KM Plotter databases do not differentiate between mature 3p and 5p miRNAs [13].Both miRNA forms are capable of binding to target mRNAs through a sequence-specific base pairing mechanism and regulate gene expression by interacting with distinct sets of target genes.However, only one mature strand is typically incorporated into the RNA-induced silencing complex (RISC) and involved in regulating mRNA transcription.Thus, for each miRNA, it was necessary to determine whether the 3p or 5p form should be used for the analysis of downstream target genes and pathways.The predominant mature form of each miRNA was selected based on the number of experimental reads in the miRBase database [14] (S1 Table ).The hsa-miR-3653 identifier was subsequently excluded as the available experimental data no longer supports its annotation as a miRNA [14].
2.4.2Target gene identification.The first step in conducting the pathway analysis was to choose a database that provides miRNA-target interaction data.MirTarBase is a manually curated database of experimentally validated miRNA-target interactions (MTI) collated from reporter assay, western blot and next generation sequencing (NGS) studies [15].The database includes more than 360,000 MTIs and represents the largest and most up to date source of experimentally validated MTI data.MirTarBase was initially selected due to its size and rigorous inclusion criteria.An alternative MTI database, miRPathDB was found to include the latest version mirTarBase data and was selected instead.The prognostic mature miRNA identifiers were queried in the mirRPathDB database, retrieving a comprehensive list of target genes.This list was cross-validated using the miRTargetLink 2.0 database which provided a second mirror of the most recent mirTarBase data, confirming a 100% match between the predictions provided by the two databases.
A Python script was written to remove duplicates and genes with no experimental evidence of miRNA-interaction.Two genes could not be mapped to current Ensembl/Entrez IDs and were excluded from further analysis.The remaining target genes were relevant and aligned with the objective of the analysis.
2.4.3Functional annotation and enrichment analysis.Functional annotation and enrichment analysis was conducted using the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool.Alternative online functional enrichment tools WebGestalt and g:Profiler were evaluated, but DAVID was selected because it is highly cited, regularly updated, and well documented [16].The target gene lists for up-regulated miRNAs and downregulated miRNAs were uploaded to DAVID which identified them as being for the Homo Sapiens species.Functional annotation enrichment was performed for targets of upregulated miRNAs and downregulated miRNAs separately, selecting only the KEGG pathway database option.Statistical analysis tested the hypothesis that the overlap between genes in the target gene list and a given pathway or functional category was greater than expected by chance.The statistical test used was Fisher's exact test and Benjamini-Hochberg correction was applied to p-values to correct for multiple tests.Text-based result files for each gene list were downloaded from DAVID.A Python script was written to filter pathways with a Benjamini-Hochberg significance threshold of � 0.05.

Analysis of enriched KEGG pathways.
To focus on the most relevant pathways, KEGG analysis results were filtered to include only pathways that are putatively involved in HCC.Relevant pathways were defined as those listed as "related" in the KEGG HCC Pathway (hsa05225) [17].A Python script was used to filter the KEGG results according to a predefined set of keywords derived from these maps.The keyword list used to filter results was: Hepatocellular, Hepatitis, Alcohol, NAFLD, PI3K, AKT, P53, WNT, MAPK, Cell cycle, calcium signaling, TGF-beta.

Validation of the functional enrichment analysis.
A validation of the DAVID functional enrichment analysis was performed using the WebGeSTALT tool, accessed in February 2024 [18].

Identifying potential miRNA interactions with cancer driver genes.
A previously reported pan-cancer analysis of more than 9000 tumor exomes from 33 TCGA projects identified 299 oncogenic driver genes and 3473 associated driver missense mutations [19].Subsequent functional validation utilizing an independent dataset of experimentally validated mutations indicated that 60-85% of the predicted mutations were likely to be cancer drivers.A list of 299 HCC and pan-cancer associated genes was retrieved from TCGA and a Python script was used to return intersection between the prognostic miRNA target genes and the 299 putative cancer driver genes.

Some differentially expressed miRNAs are prognostic in resected HCC
Using OMCD, levels of expression of miRNAs in HCC versus healthy liver tissue were compared.Out of the 106 significantly differentially expressed miRNAs, 59 had a significant, higher than 2-fold differential expression (HCC/healthy liver either >2 or <0.5), compared to normal liver: respectively, 23 were increased and 36 decreased in HCC (S2 Table ).Forty of the differentially expressed miRNAs were also present in the KMP RNA-seq dataset which we have employed for prognostic stratification (21 of which up-regulated and 19 down-regulated in HCC versus healthy liver).
In keeping with our hypothesis, we found that for the majority of miRNAs, expression in HCC versus normal tissue and prognostic significance were highly correlated.All the miRNAs up-regulated in HCC were associated with negative prognosis, 4 out of 5 miRNAs that are down-regulated in HCC were associated with better prognosis.Hsa-miR-326 was the only transcript decreased in HCC showing an unexpected negative prognostic role (HR:1.97(1.37-2.83),p 0.0002).
To increase the robustness of our analysis, a 9-miRNA prognostic signature was generated from prognostically relevant miRNAs (this signature excluded Hsa-miR-326, which showed inconsistent correlations between prognosis and expression patterns in normal vs neoplastic liver tissues).The 9-miRNA signature yielded an HR for OS of 2.15 The prognostic significance in terms of OS of each miRNA belonging to the signature is plotted separately in S2 Fig.Taken together, these results suggest that the 5 miRNAs up-regulated in HCC and associated with worse prognosis may be oncogenic, whilst the 4 miRNAs that are down-regulated in HCC and associated with better prognosis may act as tumour suppressors.Based on these criteria, we could not predict a mechanistic role for Hsa-miR-326.For this reason, we decided to also study the functional pathways associated with this peculiar transcript.

Functional pathways of miRNA-targeted genes
To understand which molecular pathways are modulated by differential miRNA expression in HCC, we used a dataset of validated miRNA/mRNA target pairs.This enabled us to identify potential functional differences between the two types of miRNAs.A total of 982 unique target protein-coding genes for the 10 miRNAs were retrieved from MiRPathDB.Among these, 430 genes were targeted by up-regulated miRNAs, while 585 genes were targeted by down-regulated ones.We then ran pathway analyses on the two separate lists of protein coding genes.
Pathway analysis on target genes was conducted with DAVID and Webgestalt.Target genes of up-regulated miRNAs were significantly enriched in 7 (DAVID) and 1 (Webgestalt) KEGG pathways.The associated pathways were mainly implicated in neurodegenerative diseases.After filtering for HCC-specific pathways, cell cycle was the only pathway associated with targets of up-regulated miRNAs in both DAVID and Webgestalt tools (Table 2).Targets of down-regulated miRNAs were significantly enriched in 93 (DAVID) and 94 (Webgestalt) KEGG pathways relating to numerous cancers, cancer related processes and associated signalling pathways.Ten HCC-specific pathways were significantly enriched (p<0.05),including HCC, Hepatitis B, MAPK signalling pathway, PI3K-Akt signalling pathway, Cell cycle, Hepatitis C, TGF-beta signalling and p53 signalling pathway (Table 3).
A graphical representation of the results of the above-mentioned enrichment analyses is presented in Fig 2 .A Comprehensive overview of KEGG HCC-specific pathways can be found in S1 Fig.
The process of intersecting the miRNA target genes with the HCC and pan cancer associated genes revealed that the list of 982 miRNA target genes contained 10 HCC driver genes and 11 pan-cancer driver genes (Table 4).
To further validate the clinical significance of our findings, we studied the predictive value of HCC-specific miRNA-targeted protein coding genes (Table 4) in a small but well characterised cohort of HCC patients treated with sorafenib.Notably, higher expression of 3 miRNAtargeted genes (RB1, NOTCH2, PIK3CA) was associated with better survival (Fig 3A -3C).In  each case, higher expression of the protein coding genes was a positive predictive factor (longer survival in sorafenib-treated patients).The three protein coding genes are targeted by 3 to 5 of our investigated miRNAs (Table 4, fourth column).Notably, the only miRNA that targets all these three protein-coding genes is Hsa-miR-326, which is down regulated in HCC, but whose higher expression is associated with worse prognosis.Moreover, the combined three-gene signature achieved an even greater predictive significance (Fig 3D).Thus, there is experimental evidence that the investigated miRNAs may also directly interact with 21 putative HCC and pan-cancer driver genes.

Discussion
In this study, we have identified a 9-miRNA prognostic signature for HCC.The proposed signature is consistently significant across different clinically relevant HCC sub-groups (e.g.ethnicity, nuclear differentiation grade) and seems to have a prominent prognostic role among Asian patients, possibly due to more frequent HBV-aetiology in this ethnic group [20].In keeping with this hypothesis, HBV-aetiology is the second most significant HCC-pathway affected by our miRNA signature (Fig 2).If confirmed in clinical datasets, this signature could enable patient stratification and treatment optimisation, especially for low-and intermediaterisk diseases.At these stages, several therapeutic options are available, but currently these treatments cannot be tailored based on individual molecular profiles.Interestingly, some miRNAs are well detectable in biological fluids, such as plasma [21].If the 9-miRNA signature proves to be detectable in plasma samples in the HCC population, it can be used as a minimally invasive biomarker for patient stratification and dynamic monitoring.
Several experimental studies have dissected the pathogenic role of specific miRNAs in HCC.Our results are in line with mechanistical evidence.Of the five up-regulated/negatively prognostic miRNAs, four have been shown to act as oncogenes in HCC cells: mir-501 by promoting epithelial to mesenchymal transition [22]; mir-3127 and mir-1180 by promoting proliferation and tumorigenesis [23,24]; mir-3677 by promoting proliferation and drug resistance [25].While the clinical significance of some of these transcripts has been described, to the best of our knowledge this is the first study showing that mir-3127 has a prognostic impact on HCC.Notably, our pathway analysis has identified cell cycle regulation as the main molecular pathway controlled by these miRNAs, further confirming the link between bioinformatic and experimental findings.
Our results on mir-877 are apparently in contrast with previous reports, showing that this transcript inhibits HCC proliferation [26].We found that this miRNA is up-regulated in HCC versus normal tissue, and that higher expression of this miRNA predicts poor prognosis.Both results imply an oncogenic function.Due to these coherent results, we decided to maintain mir-877 in our prognostic signature.Notably, another study showed that a genetic variant of this miRNA predicts HCC risk [27].It is not known if this genetic variant has a prognostic role in advanced disease settings.It is therefore conceivable that this miRNA could play different roles at different stages of disease progression (e.g.onco-suppressive for early stage, oncogenic for advanced stage), as demonstrated for other cancer-related genes [28].In light of the partially conflicting evidence, further studies are warranted to dissect the role of this miRNA at different stages of HCC progression, and to test its prognostic role in other clinical datasets.
In parallel with the results on putative oncogenic miRNAs, our results on putative tumour suppressors agree with pre-clinical results.Mir-3653, mir-326 and mir-139 have been shown to inhibit metastasis and progression of HCC cells [29][30][31].Similarly, mir-145 has been implicated in G2/M phase arrest by inhibiting cyclin B1 [32].Finally, a recent study has shown that mir-99a is a direct target of the epigenetic silencer EZH2 [33].EZH2 promotes cell proliferation, metastasis and drug resistance in several cancers by silencing the expression of tumour suppressors [34].We have shown that higher EZH2 activity (measured via a liquid biopsy approach) predicts poorer prognosis in HCC patients treated with sorafenib [35].Hence our finding on mir-99a paves the way for the combined use of miRNA and epigenetic biomarkers for patient stratification and drug efficacy prediction in HCC.
Other recent manuscripts have investigated miRNA signatures in HCC.For example, Sathipati et al identify a 23-miRNA signature associated with stage [10].Notably, 7 of these miRNAs had a prognostic role, based on the analysis of a dataset of 166 HCC samples.We acknowledge that some aspects of the methodology overlap with our study.However, there are also substantial differences between the two studies.Sathipati et al have used machine learning for miRNA selection, focussing on transcript expression at different cancer stages, whilst we have manually filtered the transcripts based on differential expression in HCC vs normal tissue and we have subsequently refined our signature based on prognostic value.Importantly, we have validated our prognostic signature in a larger dataset containing data from 372 HCC samples.Our 9-miRNA signature does not contain any of the 7 miRNAs identified by Sathipati et al.This divergence is attributable to the substantially different methodologies of the two studies.Interestingly 3 out of 5 HCC-specific upregulated miRNAs have been measured in biological fluids from cancer patients [36][37][38].This result suggests the potential clinical relevance of our miRNA signature to guide clinical diagnosis and prognosis.
We also investigated the predictive role of miRNA-modulated pathways in a curated cohort of HCC patients exposed to the tyrosine kinase inhibitor sorafenib.Sorafenib is employed for the treatment of advanced HCC as second line treatment after progression on first-line immunotherapy-based strategies, or as front-line choice for patients with absolute contraindications to combination strategies.Our results showed that higher expression of three target genes was associated with better response to the treatment.Notably, the only miRNA that targets all these three protein-coding genes is mir-326, which is down regulated in HCC, but whose higher expression is associated with worse prognosis.Because of this intriguing inconsistence, we decided to perform pathway analyses on this transcript.Based on our results, it is conceivable that this miRNA could play different roles at different stages of HCC pathogenesis.Whilst inhibiting early tumorigenic events, mir-326 could suppress the expression of therapy-sensitivity genes (NOTCH, RB1, PIK3CA) thereby conferring sorafenib resistance to HCC cells.These findings highlight the potential predictive role of the miRNA signature and of the associated target mRNAs.
In conclusion, we have identified for the first time a miRNA signature that is differentially expressed in HCC and significantly predicts prognosis.This signature includes transcripts whose function has been independently validated in pre-clinical studies.Future studies on clinical samples (primary tumours and plasma) are warranted to confirm the clinical usefulness of this potential new biomarker panel.

Fig 2 .
Fig 2. Enrichment analysis for miRNA-targeted genes in KEGG pathways.The shared x-axis represents fold enrichment and False Discovery Rates (FDR) for both plots.All listed KEGG pathways represent the enrichment of down-regulated miRNAs (left plot) and up-regulated miRNAs (right plot) targeting genes in different pathways.https://doi.org/10.1371/journal.pone.0296198.g002